577 research outputs found

Optimally splitting cases for training and testing high dimensional classifiers

Author: A Dupuy
A Rosenwald
AM Molinaro
B Efron
C Ambroise
J Schafer
JM Boer
K Fukunaga
K Shedden
Kevin K Dobbin
KI Kim
KK Dobbin
L Devroye
L Sun
LJ van't Veer
MD Radmacher
O Ledoit
R Simon
Richard M Simon
RO Duda
S Mukherjee
TR Golub
WJ Fu
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: We consider the problem of designing a study to develop a predictive classifier from high dimensional data. A common study design is to split the sample into a training set and an independent test set, where the former is used to develop the classifier and the latter to evaluate its performance. In this paper we address the question of what proportion of the samples should be devoted to the training set. How does this proportion impact the mean squared error (MSE) of the prediction accuracy estimate? Results: We develop a non-parametric algorithm for determining an optimal splitting proportion that can be applied with a specific dataset and classifier algorithm. We also perform a broad simulation study for the purpose of better understanding the factors that determine the best split proportions and to evaluate commonly used splitting strategies (1/2 training or 2/3 training) under a wide variety of conditions. These methods are based on a decomposition of the MSE into three intuitive component parts. Conclusions: By applying these approaches to a number of synthetic and real microarray datasets we show that for linear classifiers the optimal proportion depends on the overall number of samples available and the degree of differential expression between the classes. The optimal proportion was found to depend on the full dataset size (n) and classification accuracy, with higher accuracy and smaller n resulting in more cases assigned to the training set. The commonly used strategy of allocating two-thirds of cases for training was close to optimal for reasonably sized datasets (n ≥ 100) with strong signals (i.e. 85% or greater full dataset accuracy). In general, we recommend use of our nonparametric resampling approach for determining the optimal split. This approach can be applied to any dataset, using any predictor development method, to determine the best split.
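
One side of the trade-off described in this abstract can be illustrated with a short sketch: the fewer cases left for testing, the noisier the accuracy estimate. The Python below is not the authors' nonparametric algorithm, nor their three-part MSE decomposition. It only computes the textbook binomial standard error of a test-set accuracy estimate, under the simplifying assumption that the true accuracy stays fixed as the training share grows. The function name `holdout_accuracy_se` and the example values (n = 100, accuracy 0.85) are illustrative assumptions.

```python
import math


def holdout_accuracy_se(accuracy: float, n_total: int, train_fraction: float) -> float:
    """Binomial standard error of an accuracy estimate from the held-out test set.

    Assumes each test case is classified correctly with the same probability
    `accuracy`, independently of the others (a simplification).
    """
    n_train = round(n_total * train_fraction)
    n_test = n_total - n_train
    if n_test <= 0:
        raise ValueError("train_fraction leaves no cases for testing")
    return math.sqrt(accuracy * (1.0 - accuracy) / n_test)


# Compare the two common strategies named in the abstract with a larger training share.
for fraction in (1 / 2, 2 / 3, 0.8):
    se = holdout_accuracy_se(accuracy=0.85, n_total=100, train_fraction=fraction)
    print(f"training share {fraction:.2f}: standard error {se:.3f}")
```

The standard error rises as the training share grows. In practice this has to be weighed against the gain in classifier accuracy from more training cases, which this sketch deliberately ignores.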

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Defining adequate contact for transmission of Mycobacterium tuberculosis in an African urban environment

Author: Castellanos María Eugenia
Dobbin Kevin K.
Ebell Mark H.
Kakaire Robert
Kiwanuka Noah
Sekandi Juliet
Whalen Christopher C.
Zalwango Sarah
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2020

Background: The risk of infection from respiratory pathogens increases according to the contact rate between the infectious case and susceptible contact, but the definition of adequate contact for transmission is not standard. In this study we aimed to identify factors that can explain the level of contact between tuberculosis cases and their social networks in an African urban environment. Methods: This was a cross-sectional study conducted in Kampala, Uganda from 2013 to 2017. We carried out an exploratory factor analysis (EFA) on social network data from tuberculosis cases and their contacts. We evaluated the factorability of the data to EFA using the Kaiser-Meyer-Olkin Measure of Sampling Adequacy (KMO). We used principal axis factoring with oblique rotation to extract and rotate the factors, then we calculated factor scores for each using the weighted sum scores method. We assessed construct validity of the factors by associating the factors with other variables related to social mixing. Results: Tuberculosis cases (N = 120) listed their encounters with 1154 members of their social networks. Two factors were identified: the first, named “Setting”, captured 61% of the variance, whereas the second, named “Relationship”, captured 21%. Median scores for the setting and relationship factors were 10.2 (IQR 7.0, 13.6) and 7.7 (IQR 6.4, 10.1) respectively. Setting and Relationship scores varied according to the age, gender, and nature of the relationship among tuberculosis cases and their contacts. Family members had a higher median setting score (13.8, IQR 11.6, 15.7) than non-family members (7.2, IQR 6.2, 9.4). The median relationship score in family members (9.9, IQR 7.6, 11.5) was also higher than in non-family members (6.9, IQR 5.6, 8.1). For both factors, household contacts had higher scores than extra-household contacts (p < .0001). Contacts of male cases had a lower setting score than contacts of female cases. In contrast, contacts of male and female cases had similar relationship scores. Conclusions: In this large cross-sectional study from an urban African setting, we identified two factors that can assess adequate contact between tuberculosis cases and their social network members. These findings also confirm the complexity and heterogeneity of social mixing.

ResearchOnline at James Cook University

Statistical methodology for the analysis of dye-switch microarray experiments

Author: A Whitehead
E Wit
G Churchill
G Smyth
J Landgrebe
Jean-Jacques Daudin
Julie Aubert
K Dobbin
M Kerr
M Martin-Magniette
M Oleksiak
N Mansouri
Nadera Mansouri-Attia
Olivier Sandra
P Baldi
P Delmar
Tristan Mary-Huard
V Tusher
Y Yang
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: In individually dye-balanced microarray designs, each biological sample is hybridized on two different slides, once with Cy3 and once with Cy5. While this strategy ensures an automatic correction of the gene-specific labelling bias, it also induces dependencies between log-ratio measurements that must be taken into account in the statistical analysis. Results: We present two original statistical procedures for the statistical analysis of individually balanced designs. These procedures are compared with the usual ML and REML mixed model procedures proposed in most statistical toolboxes, on both simulated and real data. Conclusion: The UP procedure we propose as an alternative to usual mixed model procedures is more efficient and significantly faster to compute. This result provides some useful guidelines for the analysis of complex designs.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

HAL Descartes

ProdInra

Cardiotoxicity and myocardial hypoperfusion associated with anti‐vascular endothelial growth factor therapies: prospective cardiac magnetic resonance imaging in patients with cancer

Author: Basak S.
Berry C.
Dobbin S.J.H.
Jones R.J.
Lang N.N.
Mangion K.
Petrie M.C.
Roditi G.
Sourbron S.
Touyz R.M.
Venugopal B.
White J.
Publication venue: Wiley
Publication date: 07/05/2020

No abstract available

Crossref

Enlighten

White Rose Research Online

Validity of an isometric mid-thigh pull dynamometer in male youth athletes

Author: Dobbin N
Hunwicks R
Jones B
Morris R
Stokes K
Till KA
Trewartha G
Twist C
Publication venue: Ovid Technologies (Wolters Kluwer Health)
Publication date: 01/02/2018

The purpose of the present study was to investigate the validity of an isometric mid-thigh pull dynamometer against a criterion measure (i.e., 1,000 Hz force platform) for assessing muscle strength in male youth athletes. Twenty-two male adolescent (age 15.3 ± 0.5 years) rugby league players performed four isometric mid-thigh pull efforts (i.e., two on the dynamometer and two on the force platform) separated by 5 minutes rest in a randomised and counterbalanced order. Mean bias, typical error of estimate (TEE) and Pearson correlation coefficient for peak force (PF) and peak force minus body weight (PFBW) from the force platform were validated against peak force from the dynamometer (DynoPF). When compared to PF and PFBW, mean bias (with 90% confidence limits) for DynoPF was very large (-32.4 [-34.2 to -30.6] %) and moderate (-10.0 [-12.8 to -7.2] %), respectively. The TEE was moderate for both PF (8.1 [6.3 to 11.2] %) and PFBW (8.9 [7.0 to 12.4]). Correlations between DynoPF and PF (r = 0.90 [0.79 to 0.95]) and PFBW (r = 0.90 [0.80 to 0.95]) were nearly perfect. The isometric mid-thigh pull assessed using a dynamometer underestimated PF and PFBW obtained using a criterion force platform. However, strong correlations between the dynamometer and force platform suggest that a dynamometer provides an appropriate alternative to assess isometric mid-thigh pull strength when a force platform is not available. Therefore, practitioners can use an isometric mid-thigh pull dynamometer to assess strength in the field with youth athletes but should be aware that it underestimates peak force.
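
Two of the agreement statistics named in this abstract, mean bias and the Pearson correlation, can be sketched in a few lines of Python. This is only an illustration of the general idea. The abstract does not give the study's exact procedure (for example, whether data were log-transformed), the function names are assumptions, and the force readings below are invented for the example rather than taken from the study.

```python
import math


def mean_percent_bias(device, criterion):
    """Average of (device - criterion) / criterion across athletes, in percent."""
    ratios = [(d - c) / c for d, c in zip(device, criterion)]
    return 100.0 * sum(ratios) / len(ratios)


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Invented peak-force readings in newtons (hypothetical, not study data).
platform_pf = [2400.0, 2650.0, 2100.0, 2900.0, 2550.0]
dyno_pf = [1650.0, 1800.0, 1400.0, 1950.0, 1700.0]

print(f"mean bias: {mean_percent_bias(dyno_pf, platform_pf):.1f} %")
print(f"Pearson r: {pearson_r(dyno_pf, platform_pf):.2f}")
```

A large negative bias combined with a high correlation is the pattern the abstract reports: the device reads consistently low but ranks athletes in much the same order as the criterion.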

Crossref

ChesterRep

E-space: Manchester Metropolitan University's Research Repository

Leeds Beckett Repository

Human basal-like breast cancer is represented by one of the two mammary tumor subtypes in dogs.

Author: Dobbin Kevin K
Feng Yuan
Ho Kun-Lin
Mahawan Tanakamol
Wang Tianfang
Watson Joshua
Zhao Shaying
Publication date: 03/10/2023

Background: About 20% of breast cancers in humans are basal-like, a subtype that is often triple-negative and difficult to treat. An effective translational model for basal-like breast cancer is currently lacking and urgently needed. To determine whether spontaneous mammary tumors in pet dogs could meet this need, we subtyped canine mammary tumors and evaluated the dog-human molecular homology at the subtype level. Methods: We subtyped 236 canine mammary tumors from 3 studies by applying various subtyping strategies on their RNA-seq data. We then performed PAM50 classification with canine tumors alone, as well as with canine tumors combined with human breast tumors. We identified feature genes for human BLBC and luminal A subtypes via machine learning and used these genes to repeat canine-alone and cross-species tumor classifications. We investigated differential gene expression, signature gene set enrichment, expression association, mutational landscape, and other features for dog-human subtype comparison. Results: Our independent genome-wide subtyping consistently identified two molecularly distinct subtypes among the canine tumors. One subtype is mostly basal-like and clusters with human BLBC in cross-species PAM50 and feature gene classifications, while the other subtype does not cluster with any human breast cancer subtype. Furthermore, the canine basal-like subtype recaptures key molecular features (e.g., cell cycle gene upregulation, TP53 mutation) and gene expression patterns that characterize human BLBC. It is enriched in histological subtypes that match human breast cancer, unlike the other canine subtype. However, about 33% of canine basal-like tumors are estrogen receptor negative (ER-) and progesterone receptor positive (PR+), which is rare in human breast cancer. Further analysis reveals that these ER-PR+ canine tumors harbor additional basal-like features, including upregulation of genes of interferon-γ response and of the Wnt-pluripotency pathway. Interestingly, we observed an association of PGR expression with gene silencing in all canine tumors and with the expression of T cell exhaustion markers (e.g., PDCD1) in ER-PR+ canine tumors. Conclusions: We identify a canine mammary tumor subtype that molecularly resembles human BLBC overall and thus could serve as a vital translational model of this devastating breast cancer subtype. Our study also sheds light on the dog-human difference in the mammary tumor histology and the hormonal cycle.

University of Liverpool Repository

Analysis of global transcriptional responses of chicken following primary and secondary Eimeria acervulina infections

Author: Calvin L Keeler
CH Kim
Chul-Hong Kim
DA Hosack
Erik P Lillehoj
Hyun S Lillehoj
JP Townsend
K Dobbin
LA Cogburn
PY Muller
RA Dalloul
RB Williams
Yeong-Ho Hong
YH Hong
Publication venue: BioMed Central
Publication date: 01/01/2011

Crossref

Springer - Publisher Connector

PubMed Central

Characterization of Shewanella oneidensis MtrC: a cell-surface decaheme cytochrome involved in respiratory electron transport to extracellular electron acceptors

Author: AI Tsapin
AJ Bard
AS Beliaev
BH Kim
Brian N. Jepson
C Myers
C Schwalb
CR Myers
D Leys
David J. Richardson
DJ Richardson
E Berry
EHJ Gordon
FA Walker
GR Moore
HJ Kim
I Gautier-Luneau
J Hirst
JF Heidelberg
Jim Fredrickson
JK Fredrickson
JM Myers
JM Philo
John Zachara
Julea N. Butt
K Venkateswaren
KE Pitts
KH Nealson
Kurnikov
L Shi
Liang Shi
LJ Anderson
MG Almeida
MP Hendrich
PL Dutton
PS Dobbin
R Aasa
Robert S. Hartshorne
Sarah J. Field
SJ Field
T Horan
TA Clarke
Tom A. Clarke
VA Bamford
Publication venue: Springer Science and Business Media LLC
Publication date: 13/08/2007

MtrC is a decaheme c-type cytochrome associated with the outer cell membrane of Fe(III)-respiring species of the Shewanella genus. It is proposed to play a role in anaerobic respiration by mediating electron transfer to extracellular mineral oxides that can serve as terminal electron acceptors. The present work presents the first spectropotentiometric and voltammetric characterization of MtrC, using protein purified from Shewanella oneidensis MR-1. Potentiometric titrations, monitored by UV–vis absorption and electron paramagnetic resonance (EPR) spectroscopy, reveal that the hemes within MtrC titrate over a broad potential range spanning between approximately +100 and approximately -500 mV (vs. the standard hydrogen electrode). Across this potential window the UV–vis absorption spectra are characteristic of low-spin c-type hemes and the EPR spectra reveal broad, complex features that suggest the presence of magnetically spin-coupled low-spin c-hemes. Non-catalytic protein film voltammetry of MtrC demonstrates reversible electrochemistry over a potential window similar to that disclosed spectroscopically. The voltammetry also allows definition of kinetic properties of MtrC in direct electron exchange with a solid electrode surface and during reduction of a model Fe(III) substrate. Taken together, the data provide quantitative information on the potential domain in which MtrC can operate

Crossref

University of East Anglia digital repository

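Valtion aluehallintovirastot ja niiden ylijohtajat: Pohjoiseurooppalainen analogia Ranskan prefeikteille [State Regional Administrative Agencies and their Chief Directors: A Northern European analogy to the French prefects]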

Author: C Pollitt
C Zietsma
CK Lee
D Held
DA Gioia
E Hepburn
F Dobbin
G Hammerschmid
JW Meyer
K Palonen
KM Guenther
M Hensmans
MW Dirsmith
N Brunsson
OECD
T Christensen
WW Powell
Publication venue: Palgrave Macmillan
Publication date: 29/12/2020

This chapter examines the closest Finnish analogy to the French function of the prefect. In Finland, since 2010, this function has been vested in the institution of the State Regional Administrative Agency (SRAA, aluehallintovirasto, ‘AVI’). There are six SRAAs, each headed by a Chief Director (ylijohtaja) nominated by the government. The study had four main findings. First, despite ambiguity in institutional terminology, classifications, boundaries and identities concerning the SRAA, one can discern few true functional or structural deficiencies. Second, the SRAA is a hybrid between an institution of its own and a territorial representative of either government ministries or government agencies, to which is related the fact that each SRAA has both responsibilities concerning its territory and nationwide responsibilities. Third, tensions between performance and institutional legitimation prevail in the institution of the SRAA, but again without serious deficiencies. Fourth, the 2010 substitution of the SRAA for the former Province comprised a radical institutional change. The 2015–2019 Finnish government intended to abolish the SRAAs, but the subsequent government abandoned that reform, and ultimately by mid-2020 it became clear that the institution of the SRAA was here to stay after all. Peer reviewed.

Crossref

Helsingin yliopiston digitaalinen arkisto